Estimation of Individual Micro Data from Aggregated Open Data

Authors

  • Han-mook Yoo
  • Han-joon Kim
  • Jonghoon Chun
Abstract

In this paper, we propose a method for estimating individual micro data from aggregated open data, based on semi-supervised learning and conditional probability. First, the method collects the aggregated open data together with support data related to the individual micro data to be estimated. Next, it applies the locality sensitive hashing (LSH) algorithm to find the subset of the support data that is similar to the aggregated open data, and classifies that subset with an ensemble classification model trained by semi-supervised learning. Finally, it uses conditional probability to estimate the individual micro data by selecting, from the classification results, the record that best fits the probability distribution of the individual micro data. To evaluate the method, we estimated the individual buildings where fires occurred from aggregated open fire data. In our experiments, the method achieved an average micro data estimation accuracy of 59.41%.

Keywords: locality sensitive hashing; semi-supervised learning; ensemble; open data; conditional probability
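The retrieval and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which LSH family is used, so random-hyperplane hashing is assumed here, the semi-supervised ensemble classifier is omitted entirely, and all function names, record identifiers, and probability values are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random hyperplanes (assumed LSH family: sign random projections)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(records, hyperplanes):
    """Group support records (id -> feature vector) by their LSH signature."""
    buckets = defaultdict(list)
    for rec_id, vec in records.items():
        buckets[signature(vec, hyperplanes)].append(rec_id)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return support records that share the query's bucket."""
    return list(buckets.get(signature(query, hyperplanes), []))


def pick_most_probable(candidate_ids, attributes, cond_prob):
    """Pick the candidate whose attribute has the highest conditional probability.

    cond_prob maps an attribute value to P(value | aggregate category);
    the values are hypothetical inputs, not figures from the paper.
    """
    if not candidate_ids:
        return None
    return max(candidate_ids, key=lambda rid: cond_prob.get(attributes[rid], 0.0))
```

In this sketch a single hash table is used for brevity. Practical LSH setups usually query several tables to trade recall against the number of candidates, and the choice of feature vectors for buildings is left open.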

Similar resources

An Estimation of Multiphase Relative Permeabilities in Reservoir Cores from Micro-CT Data

With the significant increase in tomographic equipment power, there is growing demand for predicting relative permeability in porous media from digital image data. In this work, three-phase relative permeabilities are predicted by jointly applying Darcy’s and Stokes equations in two case studies, namely Bentheimer sandstone and Estaillades limestone, whose micro-CT data files were downloa...

Close Following Behavior: Estimation of Desired Gap Headway Using Loop Detector Data (TECHNICAL NOTE)

The desired gap headway of drivers during close following is the main parameter in determining the following distance between vehicles. This paper uses raw individual-vehicle data from loop detectors, covering millions of vehicles that used the M25 and M42, to estimate the gap headway distributions between successive pairs of vehicles. The data used in this paper were filtered so...

Sparse Parameter Recovery from Aggregated Data

Data aggregation is becoming an increasingly common technique for sharing sensitive information and for reducing data size when storage and/or communication costs are high. Aggregate quantities such as group averages are a form of semi-supervision, as they do not directly provide information about individual values, but despite their widespread use, prior literature on learning individual-level mo...

Spatio-Temporal Building Population Estimation for Highly Urbanized Areas Using GIS

Detailed population information is crucial for the micro-scale modeling and analysis of human behavior in urban areas. Since it is not available on the basis of individual persons, it has become necessary to derive data from aggregated census data. A variety of approaches have been published in the past, yet they are not entirely suitable for use in the micro-scale context of highly urbanized a...

An integrated analysis of individual and aggregated health data using estimating equations.

Analyses of individual disease-exposure data within a population are useful when the exposure of interest varies sufficiently within the population. When the within-population variance of exposure is limited, however, the power of the individual-data analysis is reduced. In such situations, aggregated-data analyses of disease data across populations, with a sample of individual exposure data from each ...

Journal:
  • CoRR

Volume: abs/1712.06802

Publication date: 2017